A computational framework for evaluating outcomes in infant craniosynostosis reconstruction
Historically, surgical outcomes in craniosynostosis have been evaluated by qualitative analysis, direct and indirect anthropometry, cephalometrics, and CT craniometric analysis.
Three-dimensional meshes constructed from 3dMD images, acquired from patients with synostosis at multiple time points over the course of surgical treatment, provide ideal raw data for a novel approach to 3D geometric shape analysis of surgical results.
We design an automatic computational framework for evaluating and visualizing the results of infant cranial surgeries based on 3dMD images. The goal of this framework is to assist surgeons in evaluating the efficacy of their surgical techniques. Feedback from surgeons at Texas Children's Hospital confirms that this framework is a robust computational system within which surgical outcomes in synostosis can be accurately and meaningfully evaluated.
We also propose an algorithm to generate normative infant cranial models from 3D meshes extracted from CT scans of normal infant skulls. Comparing the head shape of an affected subject with a normal control more clearly illustrates in what respects the subject's head deviates from the norm. Comparing a post-treatment subject's head shape with an age-matched control allows assessment of a specific treatment approach or surgical technique.
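As a minimal illustration of the kind of subject-versus-norm comparison described above (a sketch only, not the paper's actual pipeline), per-vertex deviation between a subject mesh and a normative model can be computed as below. The function name and toy data are hypothetical, and real 3dMD meshes would first need registration to establish vertex correspondence:

```python
import numpy as np

def shape_deviation(subject_vertices, norm_vertices):
    """Per-vertex Euclidean distance between a subject's head mesh and a
    normative model; assumes both meshes are registered, i.e., share the
    same vertex count with point-to-point correspondence."""
    return np.linalg.norm(subject_vertices - norm_vertices, axis=1)

# Toy example: a 4-vertex "mesh" displaced 2 mm along the x-axis
norm_mesh = np.zeros((4, 3))
subject_mesh = norm_mesh + np.array([2.0, 0.0, 0.0])
deviation = shape_deviation(subject_mesh, norm_mesh)
print(deviation.mean())  # mean deviation of 2.0 mm
```

In practice such a deviation field would be color-mapped onto the mesh for the visualization the framework provides to surgeons.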
PlinyCompute: A Platform for High-Performance, Distributed, Data-Intensive Tool Development
This paper describes PlinyCompute, a system for development of
high-performance, data-intensive, distributed computing tools and libraries. In
the large, PlinyCompute presents the programmer with a very high-level,
declarative interface, relying on automatic, relational-database style
optimization to figure out how to stage distributed computations. However, in
the small, PlinyCompute presents the capable systems programmer with a
persistent object data model and API (the "PC object model") and associated
memory management system that has been designed from the ground-up for high
performance, distributed, data-intensive computing. This contrasts with most
other Big Data systems, which are constructed on top of the Java Virtual
Machine (JVM), and hence must at least partially cede performance-critical
concerns such as memory management (including layout and de/allocation) and
virtual method/function dispatch to the JVM. This hybrid approach---declarative
in the large, trusting the programmer's ability to utilize the PC object model
efficiently in the small---results in a system that is ideal for the
development of reusable, data-intensive tools and libraries. Through extensive
benchmarking, we show that implementing complex object manipulation and
non-trivial, library-style computations on top of PlinyCompute can result in a
speedup of 2x to more than 50x compared to equivalent implementations on Spark.
Comment: 48 pages, including references and Appendix
Auto-Differentiation of Relational Computations for Very Large Scale Machine Learning
The relational data model was designed to facilitate large-scale data
management and analytics. We consider the problem of how to differentiate
computations expressed relationally. We show experimentally that a relational
engine running an auto-differentiated relational algorithm can easily scale to
very large datasets, and is competitive with state-of-the-art, special-purpose
systems for large-scale distributed machine learning.
Comment: ICML 202
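A toy illustration of the idea (a sketch only, not the paper's algorithm): for a linear model over a sparse matrix stored as a relation, the forward pass is a join followed by a group-by aggregation, and the gradient of the squared loss turns out to be another join-aggregate over the same relation. The relation layouts below are invented for illustration:

```python
# Sparse matrix as a relation of (row i, col j, value) tuples
X = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 3.0), (1, 1, 4.0)]
W = {0: 0.5, 1: 0.5}   # weight relation: j -> w_j
Y = {0: 2.0, 1: 4.0}   # label relation:  i -> y_i

# Forward pass: join X with W on j, then group-by i and sum
pred = {}
for i, j, v in X:
    pred[i] = pred.get(i, 0.0) + v * W[j]

# Residual relation for the loss 0.5 * sum_i (pred_i - y_i)^2
res = {i: pred[i] - Y[i] for i in pred}

# Gradient w.r.t. w_j is again relational:
# join X with res on i, then group-by j and sum
grad = {}
for i, j, v in X:
    grad[j] = grad.get(j, 0.0) + v * res[i]

print(grad)  # {0: -2.0, 1: -3.0}
```

Because both passes are ordinary joins and aggregations, a relational engine can plan, distribute, and scale them with its existing machinery, which is the premise the paper tests experimentally.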
Serving Deep Learning Model in Relational Databases
Serving deep learning (DL) models on relational data has become a critical
requirement across diverse commercial and scientific domains, sparking growing
interest recently. In this visionary paper, we embark on a comprehensive
exploration of representative architectures to address the requirement. We
highlight three pivotal paradigms: the state-of-the-art DL-Centric architecture
offloads DL computations to dedicated DL frameworks. The potential UDF-Centric
architecture encapsulates one or more tensor computations into User Defined
Functions (UDFs) within the database system. The potential Relation-Centric
architecture aims to represent a large-scale tensor
computation through relational operators. While each of these architectures
demonstrates promise in specific use scenarios, we identify urgent requirements
for seamless integration of these architectures and the middle ground between
these architectures. We delve into the gaps that impede the integration and
explore innovative strategies to close them. We present a pathway to establish
a novel database system for enabling a broad class of data-intensive DL
inference applications.
Comment: Authors are ordered alphabetically; Jia Zou is the corresponding author
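A minimal sketch of the UDF-Centric paradigm, using Python's built-in sqlite3 module with a stand-in linear scorer in place of a real DL model; the table name, column names, and UDF name are invented for illustration:

```python
import sqlite3

# Stand-in "model": a fixed linear scorer in place of a real DL model
def score(x1, x2):
    return 0.7 * x1 + 0.3 * x2

conn = sqlite3.connect(":memory:")
conn.create_function("predict", 2, score)  # register the scorer as a UDF
conn.execute("CREATE TABLE readings(x1 REAL, x2 REAL)")
conn.executemany("INSERT INTO readings VALUES (?, ?)",
                 [(1.0, 2.0), (3.0, 4.0)])

# Inference now runs inside ordinary SQL, one row at a time
rows = conn.execute("SELECT predict(x1, x2) FROM readings").fetchall()
```

The appeal is that inference composes with the rest of the query (filters, joins, aggregates); the cost, as the paper discusses, is that the database's planner sees the UDF as an opaque black box.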
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time
Large language models (LLMs) with hundreds of billions of parameters have
sparked a new wave of exciting AI applications. However, they are
computationally expensive at inference time. Sparsity is a natural approach to
reduce this cost, but existing methods either require costly retraining, have
to forgo LLM's in-context learning ability, or do not yield wall-clock time
speedup on modern hardware. We hypothesize that contextual sparsity, that is,
small, input-dependent sets of attention heads and MLP parameters that yield
approximately the same output as the dense model for a given input, can address
these issues. We show that contextual sparsity exists, that it can be
accurately predicted, and that we can exploit it to speed up LLM inference in
wall-clock time without compromising LLM's quality or in-context learning
ability. Based on these insights, we propose DejaVu, a system that uses a
low-cost algorithm to predict contextual sparsity on the fly given inputs to
each layer, along with an asynchronous and hardware-aware implementation that
speeds up LLM inference. We validate that DejaVu can reduce the inference
latency of OPT-175B by over 2X compared to the state-of-the-art
FasterTransformer, and over 6X compared to the widely used Hugging Face
implementation, without compromising model quality. The code is available at
https://github.com/FMInference/DejaVu
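The core observation behind contextual sparsity can be reproduced in a few lines: for a ReLU MLP, the set of hidden neurons that matter depends on the input, and evaluating only that set reproduces the dense output. The sketch below uses an oracle active set for clarity; DejaVu's contribution is a low-cost learned predictor that guesses this set without first computing the full pre-activations:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 8, 32                                # model dim, hidden width
W1 = rng.standard_normal((h, d))
W2 = rng.standard_normal((d, h))

def dense_mlp(x):
    return W2 @ np.maximum(W1 @ x, 0.0)     # full ReLU MLP

def sparse_mlp(x, active):
    # Evaluate only the predicted-active neurons; the rest would be
    # zeroed by the ReLU anyway, so they contribute nothing.
    hidden = np.maximum(W1[active] @ x, 0.0)
    return W2[:, active] @ hidden

x = rng.standard_normal(d)
active = np.nonzero(W1 @ x > 0)[0]          # oracle, input-dependent set
print(len(active), "of", h, "neurons needed for this input")
```

A different input activates a different subset, which is why the sparsity must be predicted per layer on the fly rather than fixed by pruning.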